Threshold Selection in Feature Screening for Error Rate Control
Abstract
A hard thresholding rule is commonly adopted in feature screening procedures to screen out unimportant predictors for ultrahigh-dimensional data. However, different thresholds are required to adapt to different contexts of problems, and an appropriate thresholding magnitude usually varies with the model and error distribution. With an ad-hoc choice, it is unclear whether all of the important predictors are selected or not, and it is very likely that the procedures would include many unimportant features. We introduce a data-adaptive threshold selection procedure with error rate control, which is applicable to most kinds of popular screening methods. The key idea is to apply the sample-splitting strategy to construct a series of statistics with the marginal symmetry property and then to utilize the symmetry for obtaining an approximation to the number of false discoveries. We show that the proposed method is able to asymptotically control the false discovery rate and per family error rate under certain conditions and still retains all of the important predictors. Three important examples are presented to illustrate the merits of the new proposed procedures. Numerical experiments indicate that the proposed methodology works well for many existing screening methods. Supplementary materials for this article are available online.
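The sample-splitting idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the statistics or the exact thresholding rule, so everything concrete here is an assumption made for illustration: the per-feature statistic is taken to be the product of marginal Pearson correlations computed on two random halves of the sample (roughly symmetric about zero for unimportant features, positive for important ones), and the threshold is the smallest value at which a knockoff-style estimate of the false discovery proportion falls below a target level `alpha`. The names `marginal_corr`, `symmetric_statistics`, and `select_threshold` are hypothetical and do not come from the article.

```python
import numpy as np


def marginal_corr(X, y):
    """Pearson correlation between each column of X and y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    return (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))


def symmetric_statistics(X, y, rng):
    """Assumed statistic: product of marginal correlations from two random halves.

    For a feature unrelated to y, the two factors are independent and centered
    at zero, so the product is symmetric about zero. For an important feature,
    both factors tend to share a sign, so the product tends to be positive.
    """
    n = X.shape[0]
    perm = rng.permutation(n)
    half_a, half_b = perm[: n // 2], perm[n // 2:]
    return marginal_corr(X[half_a], y[half_a]) * marginal_corr(X[half_b], y[half_b])


def select_threshold(W, alpha):
    """Smallest t whose estimated false discovery proportion is at most alpha.

    By symmetry, the count of statistics at or below -t approximates the number
    of unimportant features at or above t (the false discoveries).
    """
    for t in np.sort(np.abs(W[W != 0])):
        est_false = 1 + np.sum(W <= -t)
        n_selected = max(np.sum(W >= t), 1)
        if est_false / n_selected <= alpha:
            return float(t)
    return float("inf")  # no threshold meets the target level


# Simulated example (assumed setup): the first k features carry signal.
rng = np.random.default_rng(0)
n, p, k = 2000, 200, 10
X = rng.standard_normal((n, p))
y = X[:, :k].sum(axis=1) + rng.standard_normal(n)

W = symmetric_statistics(X, y, rng)
t_hat = select_threshold(W, alpha=0.2)
selected = np.flatnonzero(W >= t_hat)
print(f"threshold={t_hat:.4f}, selected {selected.size} of {p} features")
```

The threshold is chosen from the data rather than fixed in advance, which is the point the abstract makes against ad-hoc choices. A different screening statistic could replace `marginal_corr` as long as the combined statistic keeps the symmetry property for unimportant features.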
Similar resources
H-BwoaSvm: A Hybrid Model for Classification and Feature Selection of Mammography Screening Behavior Data
Breast cancer is one of the most common cancers in the world. Early detection of cancer significantly reduces the morbidity rate and treatment costs. Mammography is a known effective method for diagnosing breast cancer. One way to identify mammography screening behavior is to evaluate women's awareness of participating in mammography screening programs. Today, intelligent systems could...
Minimum Bayes error feature selection
We consider the problem of designing a linear transformation $\theta \in \mathbb{R}^{p \times n}$, of rank $p \le n$, which projects the features of a classifier $x \in \mathbb{R}^{n}$ onto $y = \theta x \in \mathbb{R}^{p}$ such as to achieve minimum Bayes error (or probability of misclassification). Two avenues will be explored: the first is to maximize the $\theta$-average divergence between the class densities and the second is to minimize the union Bhattacharyya bound in the r...
Fast SFFS-Based Algorithm for Feature Selection in Biomedical Datasets
Biomedical datasets usually include a large number of features relative to the number of samples. However, some data dimensions may be less relevant or even irrelevant to the output class. Selection of an optimal subset of features is critical, not only to reduce the processing cost but also to improve the classification results. To this end, this paper presents a hybrid method of filter and wr...
Proposed Feature Selection for Dynamic Thermal Management in Multicore Systems
Increasing the number of cores to meet the demand for more computing power has raised the processor temperature of multi-core systems. One of the main approaches to reducing temperature is dynamic thermal management. These methods are divided into two classes, reactive and proactive. Proactive methods manage the processor temperature, by forecasting the temperature be...
Impact of error estimation on feature selection
Given a large set of potential features, it is usually necessary to find a small subset with which to classify. The task of finding an optimal feature set is inherently combinatoric and therefore suboptimal algorithms are typically used to find feature sets. If feature selection is based directly on classification error, then a feature-selection algorithm must base its decision on error estimat...
Journal
Journal title: Journal of the American Statistical Association
Year: 2022
ISSN: 0162-1459, 1537-274X, 2326-6228, 1522-5445
DOI: https://doi.org/10.1080/01621459.2021.2011735